The wide availability of user-provided content in online social
media facilitates the aggregation of people around common interests,
worldviews, and narratives. Despite the enthusiastic rhetoric on the
part of some that this process generates "collective intelligence",
the WWW also allows the rapid dissemination of unsubstantiated
conspiracy theories that often elicit rapid, large, but naive social
responses such as the recent case of Jade Helm 15, where a simple
military exercise came to be perceived as the beginning of a civil
war in the US. We study how Facebook users consume information
related to two different kinds of narrative: scientific and
conspiracy news. We find that although consumers of scientific and
conspiracy stories present similar consumption patterns with respect
to content, the sizes of the spreading cascades differ. Homogeneity
appears to be the primary driver for the diffusion of contents, but
each echo chamber has its own cascade dynamics. To mimic these
dynamics, we introduce a data-driven percolation model on signed
networks.

misinformation | rumor spreading | collective narratives | crowd dynamics | 
online social media 

The massive diffusion of socio-technical systems and microblogging
platforms on the WWW creates a direct path from producers to
consumers of content, i.e., allows disintermediation, and changes
the way users become informed, debate, and form their opinions
[1, 2, 3, 4, 5]. This disintermediated environment can foster
confusion about causation, and thus encourage speculation, rumors,
and mistrust [6]. In 2011 a blogger claimed that global warming was
a fraud designed to diminish liberty and weaken democracy [7].
Misinformation about the Ebola epidemic has caused confusion among
healthcare workers [8]. Recent research [9, 10, 11] has shown that
increasing the exposure of users to unsubstantiated rumors
increases their tendency to be credulous.

According to Ref. [12], belief formation and revision are
influenced by the way communities attempt to make sense of events
or facts. Such a phenomenon is particularly evident on the WWW,
where users, embedded in homogeneous clusters [13, 14, 15], process
information through a shared system of meaning [?].

Here we analyze the cascade dynamics of Facebook users when the
content is (i) conspiracy theories and (ii) scientific information.
Conspiracy theories simplify causation, reduce the complexity of
reality, and contain uncertainty [16, 17, 18]. Scientific
information disseminates scientific advances and exhibits the
process of scientific thinking. The main difference between the two
is content verifiability. The generators of scientific information
and their data, methods, and outcomes are readily identifiable and
available. The origins of conspiracy theories are often unknown and
the content of the theories is strongly disengaged from mainstream
society and sharply divergent from recommended practices [19],
e.g., the belief that vaccinations cause autism.

Massive digital misinformation is becoming pervasive in online
social media to the extent that it has been listed by the World
Economic Forum (WEF) as one of the main threats to our society [20].
To counteract this trend, algorithm-driven solutions have been
proposed [21, 22, 23, 24, 25, 26], e.g., Google is developing a
trustworthiness score to rank the results of queries. Similarly,
Facebook has proposed a community-driven approach where users can
flag false contents to correct the news-feed algorithm. This issue
is controversial, however, because it raises fears that the free
circulation of content may be threatened and that the proposed
algorithms may not be accurate or effective [27, 28]. Often
conspiracists will denounce attempts to debunk false information,
e.g., the link between vaccination and autism, as acts of
misinformation.

Whether a claim (either substantiated or not) is accepted by an
individual is strongly influenced by social norms and the claim’s
coherence with the individual’s belief system [29, 30]. Despite some
enthusiastic claims about the growth of a collective intelligence
[31], many mechanisms animate the flow of false information that
generates false beliefs in an individual, which, once adopted, are
rarely corrected [32, 33, 34, 35].

We use quantitative analysis to show that homogeneity is the
primary driver of content diffusion and generates the formation of
homogeneous, polarized clusters, i.e., “echo chambers”
[9, 10, 36, 37]. We also find that although consumers of scientific
information and conspiracy theories exhibit similar consumption
patterns with respect to content, the cascade patterns of the two
differ. Homogeneity appears to be the preferential driver for the
diffusion of content, yet each echo chamber has its own cascade
dynamics.

The paper is structured as follows. First we provide the preliminary
definitions and details concerning data collection. We then perform
a comparative analysis and characterize the statistical signatures
of the cascades of the different kinds of content. Finally, we
introduce a data-driven model that replicates the analyzed cascade
dynamics.

Significance

Using a massive quantitative analysis of Facebook, we show that
information related to very specific narratives (conspiracy theories
and scientific news) generates homogeneous and polarized communities
that have similar information consumption patterns. To account for
these features we derive a data-driven percolation model of rumor
spreading that demonstrates that homogeneity and polarization are
the main determinants for predicting cascade size.

Methods

Ethics Statement. The data collection process has been carried out
using the Facebook Graph API [38], which is publicly available. For
the analysis (according to the specification settings of the API) we
only used publicly available data (thus users with privacy
restrictions are not included in the dataset). The pages from which
we download data are public Facebook entities and can be accessed by
anyone. User content contributing to these pages is also public
unless the user’s privacy settings specify otherwise, and in that
case it is not available to us.

Data collection. Debate about social issues continues to expand
across the Web, and unprecedented social phenomena such as the
massive recruitment of people around common interests, ideas, and
political visions are emerging. Using the approach described in
Ref. [9], we define the space of our investigation with the support
of diverse Facebook groups that are active in the debunking of
conspiracy theories.

The resulting dataset is composed of 67 public pages divided between
conspiracy and science news. A second set, composed of two troll
pages, is used as a benchmark to fit our data-driven model. The
first category (conspiracy theories) includes the pages that
disseminate alternative, controversial information, often lacking
supporting evidence and frequently advancing conspiracy theories.
The second category (science news) includes the pages that
disseminate scientific information. The third category (trolls)
includes those pages that intentionally disseminate sarcastic false
information on the Web.

For the three sets of pages we download all the posts (and their
respective user interactions) across a five-year timespan (2010 to
2014). We perform the data collection process by using the Facebook
Graph API [38], which is publicly available and accessible through
any personal Facebook user account. The exact breakdown of the data
is presented in the Supporting Information (SI) Section 1.

Preliminaries and Definitions. A tree is an undirected simple graph
that is connected and has no simple cycles. An oriented tree is a
directed acyclic graph whose underlying undirected graph is a tree.
A sharing tree in the context of our research is an oriented tree
made up of the successive sharing of a news item through the
Facebook system. The root of the sharing tree is the node that
performs the first share in time. We define the size of the sharing
tree as the number of nodes (and hence the number of news sharers)
in the tree and the height of the sharing tree as the maximum path
length from the root.
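As an illustration, the size and height just defined can be computed with a short Python sketch. The input format (a map from each sharer to the node it reshared from, with the root mapped to None) and the function name tree_size_and_height are assumptions made for this example; they are not taken from the original analysis.

```python
from collections import defaultdict, deque


def tree_size_and_height(parent):
    """Return (size, height) of a sharing tree.

    `parent` maps each sharer to the node it reshared from; the root
    (the first sharer) maps to None. This input format is an assumption.
    """
    children = defaultdict(list)
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)
    if root is None:
        raise ValueError("sharing tree has no root")

    size, height = 0, 0
    queue = deque([(root, 0)])  # breadth-first walk starting at the root
    while queue:
        node, depth = queue.popleft()
        size += 1                      # every visited node is one sharer
        height = max(height, depth)    # longest root-to-node path so far
        for child in children[node]:
            queue.append((child, depth + 1))
    return size, height


# Hypothetical tree: a page posts, two users share, one share is reshared twice.
example = {"page": None, "u1": "page", "u2": "page", "u3": "u1", "u4": "u3"}
print(tree_size_and_height(example))  # (5, 3)
```

With this convention a post that is never reshared has size 1 and height 0.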

We define the user polarization $\sigma = 2\varrho - 1$, where
$0 \le \varrho \le 1$ is the fraction of “Likes” a user gives to
conspiracy related content, and hence $-1 \le \sigma \le 1$. From
user polarization, we define the edge homogeneity, for any edge
$e_{ij}$ between nodes $i$ and $j$, as

\[ \sigma_{ij} = \sigma_i \sigma_j, \]

with $-1 \le \sigma_{ij} \le 1$. Edge homogeneity reflects the
similarity level between the polarization of the two sharing nodes.
A link in the sharing tree is homogeneous if its edge homogeneity is
positive, otherwise it is non homogeneous. We then define a sharing
path to be any path from the root to one of the leaves of the
sharing tree. A homogeneous path is a sharing path for which the
edge homogeneity of each edge is positive, i.e., a sharing path
whose edges are all homogeneous links.
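A minimal Python sketch of these definitions follows. The function names and the like counts are hypothetical and serve only to make the definitions concrete.

```python
def polarization(conspiracy_likes, total_likes):
    """User polarization: 2 * (fraction of likes on conspiracy content) - 1."""
    if total_likes <= 0:
        raise ValueError("user has no likes")
    rho = conspiracy_likes / total_likes
    return 2.0 * rho - 1.0


def edge_homogeneity(sigma_i, sigma_j):
    """Edge homogeneity: product of the polarizations of the two sharers."""
    return sigma_i * sigma_j


def is_homogeneous_path(sigmas):
    """True if every edge along a root-to-leaf path has positive homogeneity."""
    return all(edge_homogeneity(a, b) > 0 for a, b in zip(sigmas, sigmas[1:]))


# Hypothetical path: two conspiracy-leaning users followed by a science-leaning one.
path = [polarization(9, 10), polarization(8, 10), polarization(1, 10)]
print([round(s, 2) for s in path])  # [0.8, 0.6, -0.8]
print(is_homogeneous_path(path))    # False
```

The last edge joins users of opposite polarization, so its homogeneity is negative and the path is not homogeneous.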

Wald Test. We use the Wald test to compare the scaling parameters
of two power law distributions. We define it as

\[ H_0 : \hat{\alpha}_1 = \hat{\alpha}_2, \qquad
   H_1 : \hat{\alpha}_1 \ne \hat{\alpha}_2, \]

where $\hat{\alpha}_1$ and $\hat{\alpha}_2$ are the estimated
scaling parameters. The Wald statistic

\[ W = \frac{(\hat{\alpha}_1 - \hat{\alpha}_2)^2}{\mathrm{Var}(\hat{\alpha}_1)} \]

follows a $\chi^2$ distribution with one degree of freedom. We
reject the null hypothesis $H_0$ and conclude that there is a
significant difference between the two scaling parameters if the
p-value of $W$ is below a given significance level $\alpha$.
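The test can be sketched in a few lines of Python. The sketch assumes the statistic takes the form (difference of the two estimates) squared over the variance of the first estimate, and it uses the identity that the survival function of a chi-square variable with one degree of freedom at w equals erfc(sqrt(w / 2)). The variance value in the example is hypothetical.

```python
import math


def wald_test(alpha1, alpha2, var_alpha1):
    """Return (W, p_value) for H0: alpha1 == alpha2.

    Assumes W = (alpha1 - alpha2)^2 / Var(alpha1), with W following a
    chi-square distribution with one degree of freedom under H0.
    """
    if var_alpha1 <= 0:
        raise ValueError("variance must be positive")
    w = (alpha1 - alpha2) ** 2 / var_alpha1
    # Survival function of chi-square(1): P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Exponents 2.21 and 2.47 are quoted in the text; the variance 0.01 is made up.
w, p = wald_test(2.21, 2.47, 0.01)
print(p < 0.05)  # True
```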

Kolmogorov-Smirnov Test. We use the Kolmogorov-Smirnov test to
compare the empirical distribution functions of two samples. The
Kolmogorov-Smirnov statistic for two given cumulative distribution
functions $F_1(x)$ and $F_2(x)$ is

\[ D = \sup_x |F_1(x) - F_2(x)|, \]

which measures the maximum pointwise distance between the two
sample distributions. If $D$ is bigger than a given critical value
$D_\alpha$ (footnote 1), we reject the null hypothesis
$H_0 : F_1(x) = F_2(x)$ and conclude that there is a significant
difference between the two sample distributions.
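The following Python sketch computes the two-sample statistic and a critical value. It assumes the standard large-sample form of the critical value, c(alpha) * sqrt((n1 + n2) / (n1 * n2)), together with the common approximation c(alpha) = sqrt(-ln(alpha / 2) / 2); both choices, and the function names, are assumptions of this example.

```python
import bisect
import math


def ks_statistic(sample1, sample2):
    """Maximum distance between the empirical CDFs of two samples."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    for x in s1 + s2:  # the supremum is attained at an observed point
        f1 = bisect.bisect_right(s1, x) / n1  # empirical CDF of sample 1 at x
        f2 = bisect.bisect_right(s2, x) / n2  # empirical CDF of sample 2 at x
        d = max(d, abs(f1 - f2))
    return d


def ks_critical_value(n1, n2, alpha=0.05):
    """Large-sample critical value (assumed asymptotic approximation)."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n1 + n2) / (n1 * n2))


# Hypothetical lifetimes (hours) for two small samples.
a = [1, 2, 2, 3, 20, 21]
b = [1, 5, 8, 13, 30, 40]
d = ks_statistic(a, b)
print(d > ks_critical_value(len(a), len(b)))  # reject H0 only if True
```

For samples this small the asymptotic critical value is only a rough guide.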

Results and discussion 

Anatomy of Cascades. We begin our analysis by characterizing the
statistical signature of cascades as they relate to information
type. We analyze the three types (science news, conspiracy rumors,
and trolling) and find that size and maximum degree are power-law
distributed for all three. The maximum cascade size values are 952
for science news, 2422 for conspiracy news, and 3945 for trolling,
and the estimated exponents for the power law distributions are
2.21 for science news, 2.47 for conspiracy theories, and 2.44 for
trolling. Tree height values range from 1 to 5, with a maximum
height of 5 for science news and conspiracy theories and a maximum
height of 4 for trolling. For further information see SI Section
2.1.

Figure 1 shows the probability density function (PDF) of the cascade
lifetime (using hours as time units) for science and conspiracy. We
compute the lifetime as the length of time between the first user
and the last user sharing a post. In both categories we find a
first peak at approximately 1-2 hours and a second at approximately
20 hours, indicating that the temporal sharing patterns are similar
irrespective of the difference in topic. We also find that a
significant percentage of the information diffuses rapidly (24.42%
of the science news and 20.76% of the conspiracy rumors diffuse in
less than two hours, and 39.45% of science news and 40.78% of
conspiracy theories in less than five hours). Only 26.82% of the
diffusion of science news and 17.79% of conspiracy lasts more than
one day. A Kolmogorov-Smirnov test nonetheless rejects the
hypothesis $H_0$ that the two distributions are equal.

Figure 2 shows lifetime as a function of cascade size. For science
news we have a peak in the lifetime corresponding to a cascade size
value of $\approx 200$, and higher cascade size values correspond to
high lifetime variability. For conspiracy related content the
lifetime increases with cascade size.

These results suggest that news assimilation differs according to
category. Science news is usually assimilated quickly, i.e., it
reaches a higher level of diffusion in a short time, and a longer
lifetime does not correspond to a higher level of interest.
Conversely, conspiracy rumors are assimilated more slowly and show a
positive relation between lifetime and size. For both science and
conspiracy news, we compute size as a function of lifetime and
confirm that differentiation in the sharing patterns is
content-driven, and that for conspiracy there is a positive relation
between size and lifetime. For a more detailed explanation, see SI
Section 2.1.

Footnote 1. The critical value depends on the sample sizes and on
the considered significance level $\alpha$; it can be computed as

\[ D_\alpha = c(\alpha) \sqrt{\frac{n_1 + n_2}{n_1 n_2}}, \]

where $n_1$ and $n_2$ are the respective sample sizes and
$c(\alpha)$ is a fixed value associated with the significance level
$\alpha$.

Homogeneous Clusters. We next examine the social determinants that
drive sharing patterns and we focus on the role of homogeneity in
friendship networks.

Figure 3 shows the PDF of the mean edge homogeneity, computed for
all cascades of science news and conspiracy theories. It shows that
there are homogeneous links between consecutively sharing users. In
particular, the average edge homogeneity value of the entire
sharing cascade is always greater than or equal to zero, indicating
that either the information transmission occurs inside homogeneous
clusters in which all links are homogeneous or it occurs inside
mixed neighborhoods in which the balance between homogeneous and
non homogeneous links favors the former. However, the probability
that the mean edge homogeneity is close to zero is very small.

To further characterize the role of homogeneity in shaping sharing
cascades, we compute cascade size as a function of mean edge
homogeneity for both science and conspiracy news (see Figure 4). In
science news, higher levels of mean edge homogeneity in the
interval (0.5, 0.8) correspond to larger cascades, but in
conspiracy theories lower levels of mean edge homogeneity
($\approx 0.25$) correspond to larger cascades. Notice that,
although viral patterns related to distinct contents differ,
homogeneity is clearly the driver of information diffusion. In
other words, different contents generate different echo chambers,
characterized by the high level of homogeneity inside them.

The probability density function (PDF) of the edge homogeneity,
computed for science and conspiracy news as well as the two taken
together, both in the unconditional case and in the conditional
case (in the event that the user that made the first share in the
couple has a positive or negative polarization), confirms the
roughly null probability of a negative edge homogeneity (see SI
Section 2.1).

We record the CCDF of the number of all sharing paths (footnote 2)
on each tree compared with the CCDF of the number of homogeneous
paths for science and conspiracy news, and the two together. A
Kolmogorov-Smirnov test and Q-Q plots confirm that for all three
pairs of distributions considered there is no significant
statistical difference (see SI Section 2.2 for a more detailed
analysis). In SI Section 2.2 we also report the frequency of
maximum length for all sharing paths and homogeneous paths, for
both categories of content.
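For readers who want to reproduce this kind of comparison, an empirical CCDF can be computed as in the Python sketch below. It assumes the convention CCDF(x) = P(X > x); the function name and the sample values are hypothetical.

```python
from collections import Counter


def ccdf(values):
    """Empirical CCDF as a list of (x, P(X > x)) over the distinct values."""
    n = len(values)
    counts = Counter(values)
    remaining = n
    points = []
    for x in sorted(counts):
        remaining -= counts[x]          # observations strictly greater than x
        points.append((x, remaining / n))
    return points


# Hypothetical numbers of sharing paths on four trees.
print(ccdf([1, 1, 2, 4]))  # [(1, 0.5), (2, 0.25), (4, 0.0)]
```

Computing this once for all sharing paths and once for homogeneous paths gives the two curves that are compared in the text.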

We confirm the pervasiveness of homogeneous paths, but we also find
homogeneous paths in which there is a shift of $-1$ in the path
length (with respect to the total path length $k$). Notice that the
first publisher of a news item is generally a page, hence the
$(k - 1)$-homogeneous paths are due to a discordant sharing in the
first step (i.e., when the product of the first sharer’s user
polarization and the sharer page category is negative).

In summary, cascade lifetimes of science and conspiracy news exhibit
a probability peak in the first two hours and decrease rapidly in
the following hours. Despite the similar consumption patterns,
cascade lifetime expressed as a function of cascade size differs
greatly for the different content sets. The PDF of the mean edge
homogeneity indicates that there is homogeneity in the linking step
of sharing cascades. The distributions of the number of total and
homogeneous sharing paths are very similar for both content
categories.

Viral patterns related to contents belonging to different
narratives differ, but homogeneity is clearly the driver of content
diffusion.

The Model. We now introduce a percolation model of rumor spreading
to account for homogeneity and polarization. We consider $n$ users
connected by a small-world network [39]. The model parameter space
varies on a rewiring probability $r$, mimicking the network density,
and a news set of size $m$.

Every node has an opinion $\omega_i$, $i \in [1, n]$, uniformly
distributed in $[0, 1]$. Every news item has a fitness (degree of
interest) $\theta_j$, $j \in [1, m]$, uniformly distributed in
$[0, 1]$. At each step the news items are diffused and initially
shared by a group of first sharers. After the first step, the news
recursively passes to the neighborhoods of previous step sharers,
e.g., those of the first sharers during the second step. If a friend
of the previous step sharers has an opinion close to the fitness of
the news, then she shares the news again.

In particular, when

\[ |\omega_i - \theta_j| \le \delta, \]

user $i$ shares news $j$; $\delta$ is the sharing threshold.

Because $\delta$ by itself cannot capture the homogeneous clusters
observed in the data, we model the connectivity pattern as a signed
network [?, 40], considering different fractions of homogeneous
links and hence restricting diffusion of news only to homogeneous
links. We define $\phi_{HL}$ as the fraction of homogeneous links in
the network, $M$ as the number of total links, and $n_h$ as the
number of homogeneous links, thus we have:

\[ \phi_{HL} = \frac{n_h}{M}, \qquad 0 \le n_h \le M. \]

Notice that $0 \le \phi_{HL} \le 1$ and that $1 - \phi_{HL}$, the
fraction of non homogeneous links, is complementary to $\phi_{HL}$.
In particular, we can reduce the parameter space to
$\phi_{HL} \in [0.5, 1]$, as we would restrict our attention to
either one of the two complementary clusters.
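The diffusion rule can be illustrated with a simplified Python sketch for a single news item. Several details are assumptions of this example rather than the authors' implementation: the small-world network is built as a ring lattice with k neighbors per node and random rewiring, each link is independently kept as homogeneous with probability phi_hl, and first sharers are drawn uniformly at random.

```python
import random


def small_world(n, k, r, rng):
    """Ring lattice with k neighbors per node; each edge rewired with prob. r."""
    edges = set()
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if rng.random() < r:
                j = rng.randrange(n)  # rewire the far endpoint at random
            if i != j:
                edges.add((min(i, j), max(i, j)))
    return edges


def simulate_cascade(n=500, k=8, r=0.01, phi_hl=0.56, delta=0.015,
                     first_sharers=10, seed=0):
    """Cascade size of one news item spreading only along homogeneous links."""
    rng = random.Random(seed)
    omega = [rng.random() for _ in range(n)]  # user opinions in [0, 1)
    theta = rng.random()                      # fitness of the news item
    neighbors = {i: [] for i in range(n)}
    for i, j in sorted(small_world(n, k, r, rng)):
        if rng.random() < phi_hl:             # link is homogeneous: keep it
            neighbors[i].append(j)
            neighbors[j].append(i)
    shared = set(rng.sample(range(n), first_sharers))
    frontier = list(shared)
    while frontier:                           # one loop pass = one sharing step
        next_frontier = []
        for u in frontier:
            for v in neighbors[u]:
                if v not in shared and abs(omega[v] - theta) <= delta:
                    shared.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(shared)


print(simulate_cascade())  # cascade size for the default (illustrative) settings
```

Averaging the returned size over many seeds and news items is one way to explore the parameter space described in the text.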

The model can be seen as a branching process where the sharing
threshold $\delta$ and neighborhood dimension $z$ are the key
parameters. More formally, let the fitness $\theta_j$ of the $j$th
news item and the opinion $\omega_i$ of the $i$th user be i.i.d.
uniform on $[0, 1]$. Then the probability $p$ that a user $i$ shares
a post $j$ is
$p = \min(1, \theta + \delta) - \max(0, \theta - \delta) \approx 2\delta$,
since $\theta$ and $\omega$ are uniformly i.i.d. In general, if
$\omega$ and $\theta$ have distributions $f(\omega)$ and
$f(\theta)$, then $p$ will depend on $\theta$,

\[ p_\theta = f(\theta) \int_{\max(0, \theta - \delta)}^{\min(1, \theta + \delta)} f(\omega) \, d\omega. \]

If we are on a tree of degree $z$ (or on a sparse lattice of degree
$z + 1$), the average number of sharers (the branching ratio) is
defined by

\[ \mu = z p \approx 2 \delta z, \]

with a critical cascade size $S = (1 - \mu)^{-1}$. If we assume that
the distribution of the number $m$ of the first sharers is $f(m)$,
then the average cascade size is

\[ S \approx \sum_m f(m) \, m \, (1 - \mu)^{-1}
     = \frac{\langle m \rangle_f}{1 - \mu}
     \approx \frac{\langle m \rangle_f}{1 - 2 \delta z}, \]

where $\langle \cdots \rangle_f = \sum_m \cdots f(m)$ is the average
with respect to $f$. In the simulations we fixed the neighborhood
dimension $z = 8$ since the branching ratio $\mu$ depends upon the
product of $z$ and $\delta$ and, without loss of generality, we can
consider the variation of just one of them.

Footnote 2. Recall that a sharing path is here defined as any path
from the root to one of the leaves of the sharing tree. A
homogeneous path is a sharing path for which the edge homogeneity
of each edge is positive.

If we allow a probability $q$ that a neighbor of a user has a
different polarization, then the branching ratio becomes
$\mu = z (1 - q) p$. If a lattice has a degree distribution $d(k)$
($k = z + 1$), we can then assume a usual percolation process that
provides a critical branching ratio and that is linear in
$\langle k^2 \rangle_d / \langle k \rangle_d$
($\mu \approx (1 - q) p \langle z^2 \rangle / \langle z \rangle$).
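A short numerical sketch of the branching-process estimates is given below in Python. It assumes the approximation that the sharing probability is twice the sharing threshold and that the mean cascade size is the mean number of first sharers divided by one minus the branching ratio; the function names are hypothetical, and the mean of 39.28 first sharers is borrowed from Table 1 purely for illustration.

```python
def share_probability(theta, delta):
    """Probability that a uniform opinion falls within delta of fitness theta."""
    return min(1.0, theta + delta) - max(0.0, theta - delta)


def branching_ratio(z, delta, q=0.0):
    """Branching ratio z * (1 - q) * p, using the approximation p = 2 * delta."""
    return z * (1.0 - q) * 2.0 * delta


def mean_cascade_size(mean_first_sharers, z, delta, q=0.0):
    """Mean cascade size <m> / (1 - mu), valid only in the subcritical regime."""
    mu = branching_ratio(z, delta, q)
    if mu >= 1.0:
        raise ValueError("branching ratio >= 1: mean cascade size diverges")
    return mean_first_sharers / (1.0 - mu)


# z = 8 and delta = 0.015 appear in the text; 39.28 is an illustrative mean.
mu = branching_ratio(z=8, delta=0.015)
print(round(mu, 3))                                   # 0.24
print(round(mean_cascade_size(39.28, 8, 0.015), 2))   # 51.68
```

Because the branching ratio is well below 1 for these values, cascades die out after a few steps, which is in line with the small tree heights reported for the data.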

Simulation Results. We explore the model parameter space using
$n = 5,000$ nodes and $m = 1,000$ news items with the number of
first sharers distributed as (i) an inverse Gaussian, (ii) a log
normal, (iii) a Poisson, and (iv) a uniform distribution, and as the
real data distribution (from the science and conspiracy news
sample). Parameters are chosen to fit the real data distribution
(for details see SI Sections 3.1 and 3.2). In Table 1 we show a
summary of relevant statistics (min value, first quartile, median,
mean, third quartile, and max value) to compare the real data first
sharers distribution with the fitted distributions (footnote 3).
The inverse Gaussian (IG) shows the best fit for the distribution
of first sharers with respect to all the considered statistics.

Along with the first sharers distribution, we vary the sharing
threshold $\delta$ in the interval $[0.01, 0.05]$ and the fraction
of homogeneous links $\phi_{HL}$ in the interval $[0.5, 1]$. To
avoid biases induced by statistical fluctuations in the stochastic
process, each point of the parameter space is averaged over 100
iterations. $\phi_{HL} \approx 0.5$ provides a good estimate of real
data values. In particular, consistent with the division into two
echo chambers (science and conspiracy), the network is divided into
two clusters in which news items remain inside and are transmitted
solely within each community’s echo chamber (see SI Section 3.2 for
the details of the simulation results).

In addition to the science and conspiracy content sharing trees, we
downloaded a set of 1,072 sharing trees of intentionally false
information from troll pages. Frequently troll information, e.g.,
parodies of conspiracy theories such as chemtrails containing the
active principle of Viagra, is picked up by habitual conspiracy
theory consumers. In SI Section 3.2 we report the same information
as Table 1 for the trolling category. Also in this case we notice
that the best fit is obtained by the inverse Gaussian distribution.

We computed the mean and standard deviation of size and height of
all trolling sharing trees (footnote 4), and reproduced the data
using our model (footnote 5). We used fixed parameters from the
trolling messages sample (the number of nodes in the system and the
number of news items) and varied the fraction of homogeneous links
$\phi_{HL}$, the rewiring probability $r$, and the sharing
threshold $\delta$. See SI Section 3.2 for the distribution of first
sharers used and for additional simulation results of the fit on
trolling messages.

We simulated the model dynamics with the best combination of
parameters obtained from the simulations and the number of first
sharers distributed as an inverse Gaussian. Figure 5 shows the CCDF
of size and the CDF of height. A summary of relevant statistics
(min value, first quartile, median, mean, third quartile, and max
value) to compare the real data size and height distributions with
the fitted ones is reported in SI Section 3.2. We notice that the
fit is good for all the statistics, with the exception of the min
and max values of size. For the min value, the presence of a zero
is due to the fact that the inverse Gaussian is a real valued
distribution function and in the simulations we considered the
integer part of the number of first sharers, thus producing a
number of never shared pieces of information. On the other hand,
the large difference in the max value is probably due to the long
tail of the data size distribution.

Footnote 3. For details on the parameters of the fitted
distributions used see SI Section 3.2.

Footnote 4. Note that the real data values for the mean (and
standard deviation) of size and height on the troll posts are
respectively 23.54 (122.32) and 1.78 (0.73).

Footnote 5. The best parameter combination is $\phi_{HL} = 0.56$,
$r = 0.01$, $\delta = 0.015$. In this case we have a mean size equal
to 23.42 (33.43) and a mean height of 1.28 (0.88), which is indeed a
good approximation; see SI Section 3.2.

We find that the inverse Gaussian is the distribution that best
fits the data both for science and conspiracy news and for troll
messages. For this reason, we performed one more simulation using
the inverse Gaussian as the distribution of the number of first
sharers, 1,072 news items, 16,889 users, and the best parameter
combination obtained in the simulations. The CCDF of size and the
CDF of height for the above parameter combination, as well as the
basic statistics considered, fit the real data from the trolling
category.

Conclusions 

Digital misinformation has become so pervasive in online social
media that it has been listed by the World Economic Forum (WEF) as
one of the main threats to human society. Whether a news item,
either substantiated or not, is accepted as true by a user may be
strongly affected by social norms or by how much it coheres with
the user’s system of beliefs [29, 30]. Despite enthusiastic claims
that social media is generating a vast “collective intelligence”
available to all [31], many mechanisms cause false information to
gain acceptance, which in turn generates false beliefs that, once
adopted by an individual, are highly resistant to correction
[32, 33, 35]. Using extensive quantitative analysis we show that
social homogeneity is the primary driver of content diffusion, and
one frequent result is the formation of homogeneous, polarized
clusters (often called “echo chambers”). We also find that although
consumers of science news and conspiracy theories show similar
consumption patterns with respect to content, their cascades
differ, and each echo chamber has its own cascade dynamics. To
mimic these dynamics, we introduce a data-driven percolation model
of signed networks, i.e., networks composed of signed edges. Our
analysis shows that for science and conspiracy news a cascade’s
lifetime has a probability peak in the first two hours followed by
a rapid decrease. Although the consumption patterns are similar,
cascade lifetime as a function of the size differs greatly. The PDF
of the mean edge homogeneity indicates that homogeneity is present
in the linking step of sharing cascades. The distributions of the
number of total sharing paths and homogeneous sharing paths are
similar in both content categories. Viral patterns related to
distinct contents are different, but homogeneity drives content
diffusion. We simulate our data-driven percolation model by fixing
the number of users and news items downloaded from troll pages and
varying the other parameters. We compare the simulated results with
the data and find a high level of similarity.
Fig. 1. Probability density function (PDF) of lifetime computed on
science news and conspiracy theories, where the lifetime is computed
as the temporal distance (in hours) between the first and last
share of a post. Both categories show a similar behavior, with a
peak in the first two hours and another around 20 hours.

Fig. 2. Lifetime as a function of the cascade size for conspiracy
news (left) and science news (right). We note a content-driven
differentiation in the sharing patterns. For conspiracy the
lifetime grows with the size, while for science news there is a
peak in the lifetime around a value of the size equal to 200, and a
higher variability in the lifetime for larger cascades.



Fig. 3. Mean edge homogeneity for science (solid orange) and
conspiracy (dashed blue) news. The mean value of edge homogeneity
on the whole sharing cascades is always greater than or equal to
zero.

Fig. 4. Cascade size as a function of mean edge homogeneity for
science (solid orange) and conspiracy (dashed blue) news.



Fig. 5. Complementary cumulative distribution function (CCDF) of
size (left) and cumulative distribution function (CDF) of height
(right) for the best parameter combination that fits troll data
values, $(\phi_{HL}, r, \delta) = (0.56, 0.01, 0.015)$, and first
sharers distributed as $IG(18.73, 9.63)$. We note that it is indeed
a good fit of trolling data.


Table 1. Summary of relevant statistics for the first sharers
distributions: real data (Data), inverse Gaussian (IG), log normal
(LN), and Poisson (Poi).

           Data     IG       LN       Poi
Min        1        0.36     0.10     20
1st Qu.    5        4.16     3.16     35
Median     10       10.45    6.99     39
Mean       39.3     39.28    13.04    39.24
3rd Qu.    21       31.59    14.85    43
Max        3033     1814     486.10   66

























